H Ardware - Aware E Xponential a Pproximation for D Eep N Eural N Etworks

نویسندگان

  • Xue Geng
  • Jie Lin
  • Bin Zhao
  • Zhe Wang
  • Mohamed M. Sabry
  • Aly
  • Vijay Chandrasekhar
چکیده

In this paper, we address the problem of cost-efficient inference for non-linear operations in deep neural networks (DNNs), in particular, the exponential function e in softmax layer of DNNs for object detection. The goal is to minimize the hardware cost in terms of energy and area, while maintaining the application accuracy. To this end, we introduce Piecewise Linear Function (PLF) for approximating e. First, we derive a theoretical upper bound of the number of pieces required for retaining the detection accuracy. Moreover, we constrain PLF to bounded domain in order to minimize bitwidths of the lookup table of pieces, resulting in lower energy and area cost. The non-differentiable bounded PLF layer can be optimized via the straight-through estimator. ASIC synthesis demonstrates that the hardware-oriented softmax costs 4x less energy and area than the direct lookup table of e, while with comparable performance on benchmark datasets.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

L Ocal E Xplanation M Ethods for D Eep N Eural N Etworks L Ack S Ensitivity to P Arameter V Al - Ues

Explaining the output of a complicated machine learning model like a deep neural network (DNN) is a central challenge in machine learning. Several proposed local explanation methods address this issue by identifying what dimensions of a single input are most responsible for a DNN’s output. The goal of this work is to assess the sensitivity of local explanations to DNN parameter values. Somewhat...

متن کامل

A S Calable L Aplace a Pproximation for N Eural N Etworks

We leverage recent insights from second-order optimisation for neural networks to construct a Kronecker factored Laplace approximation to the posterior over the weights of a trained network. Our approximation requires no modification of the training procedure, enabling practitioners to estimate the uncertainty of their models currently used in production without having to retrain them. We exten...

متن کامل

Iclr 2018 a Ttention - B Ased G Uided S Tructured S Parsity of D Eep N Eural N Etworks

Network pruning is aimed at imposing sparsity in a neural network architecture by increasing the portion of zero-valued weights for reducing its size regarding energy-efficiency consideration and increasing evaluation speed. In most of the conducted research efforts, the sparsity is enforced for network pruning without any attention to the internal network characteristics such as unbalanced out...

متن کامل

B Lack - Box a Ttacks on D Eep N Eural N Etworks via G

In this paper, we propose novel Gradient Estimation black-box attacks to generate adversarial examples with query access to the target model’s class probabilities, which do not rely on transferability. We also propose strategies to decouple the number of queries required to generate each adversarial example from the dimensionality of the input. An iterative variant of our attack achieves close ...

متن کامل

Isk L Andscape a Nalysis for U Nder - Standing D Eep N Eural N Etworks

This work aims to provide comprehensive landscape analysis of empirical risk in deep neural networks (DNNs), including the convergence behavior of its gradient, its stationary points and the empirical risk itself to their corresponding population counterparts, which reveals how various network parameters determine the convergence performance. In particular, for an l-layer linear neural network ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2018